Our microservices have several endpoints consuming memory or CPU. Depending on the number of iterations and bites we pass as parameters to these endpoints, it can take significantly longer to respond. We are going to explore the problem by looking at CPU and memory metrics.
CPU
When you know CPU usage you are better prepared to answer the following questions:
-
Is the amount of CPU resources maxed out?
-
Have I over provisioned the amount of CPU resources?
-
What does the baseline of usage looks like?
-
Is there room to grow without scaling out or up?
-
How much of the available CPU resources is it really using?
-
What type of load is it?
Memory
When you know memory usage you are better prepared to answer the following questions:
-
Is the amount of memory used close to the maximum available memory?
-
Have I over provisioned the amount of memory resources?
-
What does the baseline of usage look like?
-
Is there room to grow without scaling out or up?
-
How much of the available memory resources is it really using?
Monitoring
One of the first thing you usually want to do once your application is deployed is to configure monitoring. We’ll use the Azure portal to create a nice dashboard for monitoring our application metrics.
Open the Azure Portal and navigate to the resource group rg-java-runtimes you created for your application.
Select the quarkus-app container app, then select Metrics from the left menu, under the Monitoring group.
Using the Metrics panel, you can select which metrics you want to observe, and the time range for the data. Under the Standard Metrics namespace, you can see the list of available built-in metrics for your application. You can also create your own custom metrics.
Select "CPU Usage" from the list of metrics, choose "Max" for the Aggregation and select the last 30 minutes for the time range (in the top right of the panel). You can see the CPU and memory usage for your application.
Creating charts for all services
Now let’s add the CPU metrics of the Micronaut and Spring apps to the same chart.
Select Add metric and select "CPU Usage" and aggregation "Max" again.
Then click on the Scope setting, unselect quarkus-app and pick the micronaut-app.
Select Apply, and repeat the same for the spring-app.
Now you should see a nice chart with the CPU usage of all three applications. Let’s save this chart on the dashboard!
Select Save to dashboard and choose Pin to dashboard:
In the Pin to dashboard dialog, select Create new and give it a name, for example "Java Runtimes" then click Create and pin.
We’ll also add charts to monitor the memory usage and number of replicas of our applications.
Click on +New chart (top left corner) and repeat the same steps as before to create a chart with "Memory Working Set Bytes" metrics with aggregation "Max". Do this for the three applications (Quarkus, Micronaut and Spring Boot). Select Save to dashboard, choose Pin to dashboard, and select the existing "Java Runtimes" dashboard.
Click on +New chart again, and create another chart with the "Replica Count" metric with aggregation "Max". Don’t forget to save this dashboard as well (select Save to dashboard, choose Pin to dashboard, and select the existing "Java Runtimes" dashboard).
When you’re finished, in the Azure Portal sidebar select Dashboards and choose the "Java Runtimes" dashboard we created earlier.
|
Depending on your setup, you might see the Azure Portal sidebar menu or not. If not, click on the burger menu at the very top left of the portal and you will see the Dashboard menu. |
You should now see the 3 charts you just created. You can rearrange the charts by dragging them around, and resize them by dragging the bottom right corner.
Logs
The Azure Portal also provides a nice interface to view the logs of your application. This is crucial to troubleshoot issues when something goes wrong.
You have multiple options to view the different logs of your application:
-
You can connect to a given container apps instance and gets the stream of console logs. This is useful to troubleshoot issues with a specific instance of your application in real time.
-
You can also access the console logs in Azure Portal. You get access to all logs from the Log Analytics workspace we created earlier. Using SQL-like queries (called Kusto Query Language, or KQL), you can filter the logs and get the information you need across all your applications, revisions, and instances.
-
Finally, you also have access to the system logs, which are the logs of the container host. It’s very useful to troubleshoot issues with the container host itself and find out why your container is not running.
Streaming logs of a container instance
The most straightforward way to access the logs of your application is to connect to a given container instance and get the stream of console logs. You can directly access the stream of logs from the latest revision of a running instance of your application with this command:
az containerapp logs show \
--name "$QUARKUS_APP" \
--resource-group "$RESOURCE_GROUP" \
--format text \
--follow
|
Don’t forget that by default, containers apps scale out to 0 instances when they are not used.
You can use the command |
Once you’re connected, you can see the logs of your application in real time.
The --follow option keeps the connection open to see the new logs in real time.
In a different terminal, you can use the curl command to generate some load on the application.
You should see the logs of the application being updated in real time.
curl https://${QUARKUS_HOST}/quarkus/cpu?iterations=10
Viewing the console logs in Azure portal
You can also access the console logs in the Azure Portal or via CLI. Using Log Analytics, you get access to all logs from all your applications, revisions, and instances. This means you can troubleshoot issues across all your applications, tracing requests as needed, which is crucial if your application use a microservices architecture.
Open the Azure Portal and navigate to the resource group rg-java-runtimes you created for your application.
Select the logs-java-runtimes container app, then select Logs from the left menu, under the General group.
By default, you are presented a list of pre-defined queries. Close this panel by clicking on the X button on the top right corner, as we’ll create our own query.
Log analytics queries use the Kusto query language, which is a SQL-like language. You can use the query language to filter the logs and get the information you need.
Let’s start by creating a query to get the logs of the quarkus-app container app.
Enter this query in the editor:
ContainerAppConsoleLogs_CL
| where RevisionName_s == "quarkus-app--<REVISION_ID>"
You can get the app revision name by running the following command:
az containerapp revision list \
--name "$QUARKUS_APP" \
--resource-group "$RESOURCE_GROUP" \
--query "[0].name" --output tsv
Select Run to execute the query.
You should see the logs of the quarkus-app container app.
For now, it’s not very useful as it’s the same logs we saw in the previous section. Let’s add some filters to the query to search for error messages from the logs:
ContainerAppConsoleLogs_CL
| where RevisionName_s == "quarkus-app--<REVISION_ID>"
| where Log_s !has "INFO"
| where Log_s contains "error"
Select Run to execute the query. If your application is working fine, you should not see any results. Let’s generate some errors by crashing the application with the following command:
curl https://${QUARKUS_HOST}/quarkus/memory?bites=1000
Oops! We’re trying to allocate more memory than the container has available, resulting in a crash because of our (crude) memory allocation algorithm.
If you run the query again, you should see the error message in the logs:
|
Of course, you can go much further with the query language. You can have a look at the quick reference to play a bit with the queries. |
We can make it easier to read by making the latest logs appear at the top of the results, and only show the time and message of the last 10 logs:
ContainerAppConsoleLogs_CL
| where RevisionName_s == "quarkus-app--<REVISION_ID>"
| where Log_s !has "INFO"
| where Log_s contains "error"
| project TimeGenerated, Log_s
| sort by TimeGenerated desc
| take 10
And here we can quickly see our OutOfMemoryError error message.
We can save the query for later use by clicking on the Save button. You can then give it a name and a description, and a category to quickly find it later.
Viewing the console logs with the CLI
You can also view the logs via a CLI command instead of using the portal.
It’s just a matter of writing any KQL query in the analytics-query parameter as follow:
az monitor log-analytics query \
--workspace $LOG_ANALYTICS_WORKSPACE_CLIENT_ID \
--analytics-query "ContainerAppConsoleLogs_CL | where RevisionName_s == 'quarkus-app--<REVISION_ID>'" \
--output table
Viewing the system logs
While console logs are very useful to troubleshoot issues with your application, it won’t be much help if your container is not running. In this case, you need to troubleshoot the container host itself. This is where the system logs come in handy.
The system logs are the logs of the container host. They are very useful to troubleshoot issues with the container host itself and monitor its activity during provisioning operations.
System logs can be accessed the same way as console logs. Change the previous query in the editor to get the system logs:
ContainerAppSystemLogs_CL
| where RevisionName_s == "quarkus-app--<REVISION_ID>"
| project TimeGenerated, Reason_s, Log_s
| sort by TimeGenerated desc
You should see the system logs of the quarkus-app container app, covering the whole lifecycle of the container.
Using the CLI you can execute:
az monitor log-analytics query \
--workspace $LOG_ANALYTICS_WORKSPACE_CLIENT_ID \
--analytics-query "ContainerAppSystemLogs_CL | where RevisionName_s == 'quarkus-app--<REVISION_ID>' | project TimeGenerated, Reason_s, Log_s | sort by TimeGenerated desc" \
--output table
Load Testing
Now it’s time to add some load to the application. This will allow us to see how the auto-scaling features works in Azure Container Apps.
To add some load to an application, you can do it locally using JMeter, but you can also do it remotely on Azure using Azure Load Testing and JMeter.
Azure Load Testing is a fully managed load-testing service built for Azure that makes it easy to generate high-scale load and identify app performance bottlenecks. It is available on the Azure Marketplace.
Setting up Azure Load Testing
To use Azure Load Testing, go to the Azure Portal, select Create a resource in the sidebar and search for "Azure Load Testing".
1. Select Create:
2.
In the Resource group field, select the rg-java-runtimes that we created previously.
3.
Set the name lt-java-runtimes-<Your unique identifier> for the load testing instance (the name has to be unique).
4. Set the location to match our previously created resources (East US).
5. Select Review + Create, then Create.
Creating a load testing resource can take a few moment. Once created, you should see the Azure Load Testing available in your resource group:
Select lt-java-runtimes, and then click on Tests and then Create.
1. You can either create a quick load test using a wizard, or create a load test using a JMeter script. Choose this second option.
2. Before uploading a JMeter script, create a load test by entering a name (eg. "Add some load"), a description and click on the "Test plan" tab:
3.
Now that you are on the "Test plan" tab, you can upload the JMeter file (located under scripts/jmeter/src/test/jmeter/load.jmx) as well as the user.properties file.
The JMeter file sets up a load campaign targeting the various endpoints.
But before uploading the user.properties file, make sure you change the properties (QUARKUS_HOST, SPRING_HOST and MICRONAUT_HOST) to match your deployed applications:
LOOPS=50
THREADS=16
RAMP=1
CPU_ITERATIONS=5
MEMORY_BITES=20
# Put your quarkus host here
QUARKUS_HOST=<Your QUARKUS_HOST>
QUARKUS_PROTOCOL=https
QUARKUS_PORT=443
# Put your spring host here
SPRING_HOST=<Your SPRING_HOST>
SPRING_PROTOCOL=https
SPRING_PORT=443
# Put your micronaut host here
MICRONAUT_PROTOCOL=https
MICRONAUT_HOST=<Your MICRONAUT_HOST>
MICRONAUT_PORT=443
4.
Select Upload, and choose "User properties" in the File relevance field of the user.properties file.
5. Select Review + Create, then Create.
Running the tests
After the test creation, it will start automatically after a short time. When the test run finishes, you will get some metrics:
In the Azure Portal sidebar, select Dashboards and go back to the "Java Runtimes" dashboard we created earlier.
|
There might be a slight delay before the metrics are visible in the dashboard. |
If you take a look at the charts, you can see CPU and memory usage increase, and also that the number of replicas has increased from 1 replica to 10. Azure Container Apps has scaled automatically the application depending on the load.
Scaling
Now that we have seen the auto-scaling in action, let’s dive a bit into fine-tuning its configuration.
|
Azure Container Apps only support automatic horizontal scaling, meaning that it will only scale the number of replicas of your application. Vertical scaling (increasing the amount of available CPU and memory) is supported, but you need to do it manually. |
Azure Container Apps support different types of scaling rules, implemented using the KEDA Scaling Object. It supports the following triggers:
-
HTTP traffic: Scaling based on the number of concurrent HTTP requests to your revision. This is the default scaling rule.
-
TCP traffic: Scaling based on the number of concurrent TCP requests to your revision.
-
Event-driven: Event-based triggers such as messages in an Azure Service Bus.
-
CPU or Memory usage: Scaling based on the amount of CPU or memory consumed by a replica.
As our applications provides endpoints to load either the CPU or the memory, we will explore usage of the CPU and Memory usage triggers to scale our application.
Scaling based on CPU usage
To scale based on CPU usage, we need to update the scale rule of the application to use the cpu trigger.
This will create a new revision of the application (but the URL $QUARKUS_HOST remains unchanged), and will start a new deployment.
We will set a new scale rule for our Quarkus app using the Azure CLI:
az containerapp update \
--name "$QUARKUS_APP" \
--resource-group "$RESOURCE_GROUP" \
--scale-rule-name "cpu-scaling" \
--scale-rule-type "cpu" \
--scale-rule-metadata type=Utilization value=10 \
--min-replicas 1 \
--max-replicas 10
This will automatically scale out the application when the CPU usage is above 10% (we set it low deliberately to make it easy to go up).
Go back to the Azure Portal, and search for lt-java-runtimes to open again our load testing instance.
Select Tests in the left sidebar, open the test we created earlier and select Run to run the load tests again.
Once the test is finished, go back to the "Java Runtimes" dashboard and take a look at the number of replicas chart again. You should see that the number of replicas has increased to 10, and that the CPU usage has increased as well.
|
When using either the CPU or Memory usage triggers, the minimum number of replicas will always be 1. Use HTTP or event based triggers to allow scaling to 0. |
Scaling based on memory usage
Another option that we can use is to scale based on the memory usage, with the memory trigger.
This we will set the scale rule for our Micronaut app using the command:
az containerapp update \
--name "$MICRONAUT_APP" \
--resource-group "$RESOURCE_GROUP" \
--scale-rule-name "memory-scaling" \
--scale-rule-type "memory" \
--scale-rule-metadata type=Utilization value=15 \
--min-replicas 1 \
--max-replicas 10
This will automatically scale out the application when the memory usage is above 15% (we set it low deliberately to make it easy to go up).
Again, go back to the Azure Portal and run the load tests again. Open the dashboard when the test is finished, and take a look at the number of replicas chart again.
You can now compare how the CPU (Quarkus), memory (Micronaut), and HTTP (Spring) triggers behave when scaling the application, under the same load.
As you can see, using different scaling triggers allows to tune the scaling behavior of your application, depending on the type of load you want to handle. Note that you’re not limited to only one scaling trigger, you can use multiple triggers at the same time.
|
Fine tuning the scaling rules is a key factor to get the best performance/cost ratio for your application. You want to make sure that you don’t scale too early, and that you don’t scale too much to avoid paying for resources that are not needed. |
Checking the Metrics in the Database
Remember that we have a PostgreSQL Database with three tables where we store our metrics. You can execute the following SQL statements so you get all the metrics for Quarkus, Micronaut and Spring Boot.
az postgres flexible-server execute \
--name "$POSTGRES_DB" \
--admin-user "$POSTGRES_DB_ADMIN" \
--admin-password "$POSTGRES_DB_PWD" \
--database-name "$POSTGRES_DB_SCHEMA" \
--querytext "select Duration, Parameter, Description from Statistics_Quarkus" \
--output table
az postgres flexible-server execute \
--name "$POSTGRES_DB" \
--admin-user "$POSTGRES_DB_ADMIN" \
--admin-password "$POSTGRES_DB_PWD" \
--database-name "$POSTGRES_DB_SCHEMA" \
--querytext "select Duration, Parameter, Description from Statistics_Micronaut" \
--output table
az postgres flexible-server execute \
--name "$POSTGRES_DB" \
--admin-user "$POSTGRES_DB_ADMIN" \
--admin-password "$POSTGRES_DB_PWD" \
--database-name "$POSTGRES_DB_SCHEMA" \
--querytext "select Duration, Parameter, Description from Statistics_Springboot" \
--output table